On Some Geometric Behavior of Value Iteration on the Orthant: Switching System Perspective
In this paper, the primary goal is to offer additional insights into value iteration through the lens of switching system models from the control community. These models establish a connection between value iteration and switching system theory, and they reveal additional geometric behaviors of value iteration in solving discounted Markov decision problems. Specifically, the main contributions of this paper are twofold: 1) we provide a switching system model of value iteration and, based on it, offer a different proof of the contraction property of value iteration; 2) building on these additional insights, we prove new geometric behaviors of value iteration when the initial iterate lies in a special region. We anticipate that the proposed perspectives may serve as a useful tool applicable in various settings, and further development of these methods could be a valuable avenue for future research.
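As context for the abstract above, standard value iteration for a discounted MDP repeatedly applies the Bellman optimality operator, which is a gamma-contraction in the sup-norm. A minimal sketch on a hypothetical two-state, two-action MDP (the transition matrices `P`, rewards `R`, and discount `gamma` below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

gamma = 0.9
# Hypothetical 2-state, 2-action MDP: P[a] is the transition matrix and
# R[a] the reward vector under action a.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],   # transitions under action 0
              [[0.5, 0.5], [0.6, 0.4]]])  # transitions under action 1
R = np.array([[1.0, 0.0],                 # rewards under action 0
              [0.5, 0.8]])                # rewards under action 1

V = np.zeros(2)
for _ in range(1000):
    # Bellman optimality operator T: componentwise max over actions.
    # Each sweep applies the linear map of the greedy action, which is
    # the switched-linear-system viewpoint the abstract refers to.
    Q = R + gamma * np.einsum('aij,j->ai', P, V)
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-10:  # sup-norm contraction => converges
        break
    V = V_new
```

Because the operator contracts with modulus `gamma`, the loop reaches the tolerance in roughly `log(tol)/log(gamma)` sweeps regardless of the initial iterate.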
The Dynamics of Productivity Changes in Agricultural Sector of Transition Countries
Relying on a frontier production approach (e.g., Luenberger's shortage function), we investigated the performance of the agricultural sector in transition countries and its changes over time, focusing especially on the dynamics of productivity change. We found that: (i) CEE countries improved their performance during the sample period whereas CIS countries did not; (ii) productivity changes in the last decade were attributable to technical progress; (iii) overall performance decelerated in the second five-year sub-period (1997-2001) in both regions; (iv) agricultural reform had positive effects on productivity and its components, especially in CEE countries.
Keywords: transition countries, productivity, directional distance function, agricultural reform, Productivity Analysis
Suppressing Overestimation in Q-Learning through Adversarial Behaviors
The goal of this paper is to propose a new Q-learning algorithm with a dummy adversarial player, called dummy adversarial Q-learning (DAQ), that can effectively regulate the overestimation bias in standard Q-learning. With the dummy player, the learning can be formulated as a two-player zero-sum game. The proposed DAQ unifies several Q-learning variants for controlling overestimation bias, such as maxmin Q-learning and minmax Q-learning (proposed in this paper), in a single framework. DAQ is a simple but effective way to suppress the overestimation bias through dummy adversarial behaviors and can be easily applied to off-the-shelf reinforcement learning algorithms to improve their performance. A finite-time convergence of DAQ is analyzed from an integrated perspective by adapting an adversarial Q-learning analysis. The performance of the suggested DAQ is demonstrated empirically in various benchmark environments.
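For context on the overestimation bias the abstract targets, here is a minimal sketch of maxmin Q-learning, one of the variants the paper names: the update target takes the elementwise minimum over several independent Q-estimators, which counteracts the upward bias of the max operator under noisy rewards. The toy one-state environment, learning rates, and estimator count below are illustrative assumptions, not DAQ itself:

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, n_estimators = 1, 2, 2
# Independent Q-tables; targets use the elementwise minimum across them.
Q = np.zeros((n_estimators, n_states, n_actions))

def step(state, action):
    # Hypothetical bandit-style MDP: action 0 gives zero-mean noisy reward,
    # action 1 gives a deterministic -0.1; the noisy arm is what tempts a
    # single max-based learner into overestimation.
    reward = rng.normal(0.0, 1.0) if action == 0 else -0.1
    return state, reward

alpha, gamma = 0.1, 0.9
state = 0
for t in range(5000):
    action = int(rng.integers(n_actions))      # uniform exploration
    next_state, reward = step(state, action)
    Q_min = Q.min(axis=0)                      # pessimistic combined estimate
    target = reward + gamma * Q_min[next_state].max()
    i = int(rng.integers(n_estimators))        # update one estimator at random
    Q[i, state, action] += alpha * (target - Q[i, state, action])
    state = next_state
```

Using the minimum over estimators in the target, rather than a single estimator's own max, is the mechanism that keeps the bootstrapped values from drifting upward under reward noise.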
- …